ON THE STOCHASTIC DISSEMINATION OF FAULTS IN AN ADMISSIBLE NETWORK
by A. Kyrala 


1. INTRODUCTION


This paper discusses the dynamic distribution of faults in a general type of
network, to be defined in the next section and designated "admissible." The
starting point is a UNIQUELY BRANCHED NETWORK in which each pair of nodes is
connected by a single branch. The extension to MULTIPLE BRANCHED NETWORKS, in
which each formerly unique branch is replaced by two or more branches, is
discussed later under the subject of REDUNDANCY IN NETWORKS (Sec. 11).

The basic discrete model used here is the MARKOV CHAIN, although the 
extension to a SEMI-MARKOV chain will be discussed later. 

2. NETWORK MODIFICATION 


It will be supposed that there exists a discrete clock time universal for
the entire network with a fundamental time interval $\tau$ and that each branch
transit time is an integer multiple of $\tau$. In an arbitrary network, this may
be approximated by inserting additional new (bipolar) nodes into branches whose
original transit times are larger multiples of $\tau$. If a network (in original
form) is such that a signal can be delayed by a multiple of $\tau$ at a node,
this delay is equivalent to a zero delay at the node followed by insertion of an
appropriate number of bipolar nodes into the output branches of the node.

Generally, the actual network to be treated will be unlike the uniquely
branched network, which is the starting configuration to be analyzed; however,
the actual network can be obtained from the uniquely branched one by deletion
of branches, node insertions and the addition of redundant parallel branches.
The network so modified will be called an ADMISSIBLE NETWORK.

3. MARKOV CHAIN 


The general n-node uniquely branched network (which is not necessarily
two-dimensional) has a (triangular) number of branches given by

$$\Delta_{n-1} = n(n-1)/2 \tag{3.1}$$

Let $p_{j|t}$ denote the absolute probability that a signal has reached the jth
node at epoch$^{1}$ t. Supposing that sufficient bipolar nodes have already been
inserted so that each time interval $\tau$ represents a possible transition period
between adjacent nodes, let $a_{ij|t}$ denote the conditional probability of
transit from node j to node i (i.e., through branch j to i) during the time
interval $(t, t+\tau)$ contingent upon the signal having attained node j at
epoch t.

The post-transition probability of occupancy of node i at epoch $(t+\tau)$ is
then taken to be the linear homogeneous combination of the pre-transition
probabilities of occupancy given by the following expression$^{2}$

$$p_{i|t+\tau} = \sum_j a_{ij|t}\, p_{j|t} \tag{3.2}$$

for i=1 to n subject to the principle of causality (for each epoch t)

$$\sum_j a_{ji|t} = 1 \tag{3.3}$$

which sets the direction of time and arranges for the transition matrix to have 
columns summing to unity. It is also supposed that the components of each 
occupancy vector sum to unity in keeping with its stochastic interpretation. 

Multiplication of (3.3) by $p_{i|t}$ and subtraction from (3.2) then yields

$$p_{i|t+\tau} - p_{i|t} = \sum_j \left( a_{ij|t}\, p_{j|t} - a_{ji|t}\, p_{i|t} \right) \tag{3.4}$$

for i=1 to n giving the change in occupancy probability as a sum of
differences between absolute probabilities of transition into and out of i.
One notes in passing that (3.4) exhibits the sufficiency of the Principle
of Detailed Balance (with absolute, not conditional probabilities of
transition) to ensure stationarity of the Markov chain characterized by the
vanishing of the left side of (3.4).
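
The passage from (3.2) and (3.3) to (3.4) can be checked numerically. The following is a minimal Python sketch under stated assumptions: the 3-node transition matrix and the starting occupancy vector are illustrative values chosen only so that each column sums to unity, and the function names `step` and `net_flow` are hypothetical.

```python
def step(A, p):
    """One transition per (3.2): p_i(t+tau) = sum_j a_ij * p_j."""
    n = len(p)
    return [sum(A[i][j] * p[j] for j in range(n)) for i in range(n)]


def net_flow(A, p):
    """Right side of (3.4): flow into node i minus flow out of node i."""
    n = len(p)
    return [sum(A[i][j] * p[j] - A[j][i] * p[i] for j in range(n))
            for i in range(n)]


# Assumed example values: A[i][j] is the probability of transit from j to i,
# so every COLUMN sums to 1 as the causality principle (3.3) requires.
A = [[0.5, 0.2, 0.0],
     [0.3, 0.6, 0.5],
     [0.2, 0.2, 0.5]]
p = [0.7, 0.2, 0.1]

p_next = step(A, p)
change = [p_next[i] - p[i] for i in range(len(p))]
flow = net_flow(A, p)

print("p(t+tau):", p_next)
print("change  :", change)
print("net flow:", flow)
```

The change in each occupancy probability agrees with the net flow, and the occupancy vector stays normalized because the columns of the matrix sum to unity.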

A system for which (3.2) and (3.3) hold is called a MARKOV CHAIN and
includes as special cases the Fermi-Dirac and Einstein-Bose statistics, the
diffusion equation, the Boltzmann transport equation as well as (in complex
generalization) the Schroedinger and Dirac equations of Quantum Mechanics$^{3}$.

A concrete example of such a Markov chain is afforded by a system which
possesses only two states: "operative", designated by the subscript o, and
"inoperative", designated by the subscript i. Suppose that the system
undergoes transitions between these states for a very long time. Each
transition is characterized by the chain equations

$$p_o^{+} = a_{oo}\, p_o + a_{oi}\, p_i \tag{3.5}$$

$$p_i^{+} = a_{io}\, p_o + a_{ii}\, p_i \tag{3.6}$$

and the causality conditions,

$$a_{oo} + a_{io} = 1 \tag{3.7}$$

$$a_{oi} + a_{ii} = 1 \tag{3.8}$$

as well as the normalization condition

$$p_o + p_i = 1 \tag{3.9}$$


with the + indicating post-transition absolute probabilities and

$p_o$ = absolute pre-transition probability that system is operative
$p_i$ = absolute pre-transition probability that system is inoperative
$a_{oo}$ = conditional probability that system remains operative after the
transition CONTINGENT upon having been operative before the transition
$a_{oi}$ = conditional probability that system becomes operative after the
transition CONTINGENT upon having been inoperative before the transition
$a_{io}$ = conditional probability that system becomes inoperative after the
transition CONTINGENT upon having been operative before the transition
$a_{ii}$ = conditional probability that system remains inoperative after the
transition CONTINGENT upon having been inoperative before the transition

Under stationary conditions (after a great many transitions) the + may
be removed (i.e., no further change in absolute probabilities occurs) so that

$$p_o = a_{oo}\, p_o + a_{oi}\, p_i \tag{3.10}$$

$$p_i = a_{io}\, p_o + a_{ii}\, p_i \tag{3.11}$$

Using the causality conditions (3.7), (3.8), one then concludes that detailed
balance holds for the absolute probabilities of transitions between distinct
states. Thus,

$$a_{io}\, p_o = a_{oi}\, p_i \tag{3.12}$$

Hence,

$$p_o / p_i = a_{oi} / a_{io} \tag{3.13}$$

Adding 1 to each side and using (3.9), this yields

$$p_i = a_{io} / (a_{io} + a_{oi}) \tag{3.14}$$

and

$$p_o = a_{oi} / (a_{io} + a_{oi}) \tag{3.15}$$


The fraction of transitions during which the system is operative is given by

$$N_o / (N_o + N_i) = a_{oi} / (a_{oi} + a_{io}) = p_o \tag{3.16}$$

while the expected number of transitions for recurrence of the inoperative
state is given by

$$N(\text{inop} \to \text{inop}) = 1/p_i = (a_{oi} + a_{io}) / a_{io} \tag{3.17}$$

and the expected number of transitions for recurrence of the operative state is
given by

$$N(\text{op} \to \text{op}) = 1/p_o = (a_{oi} + a_{io}) / a_{oi} \tag{3.18}$$

The expected number of transitions for the first passage from inoperative to
operative state is given by

$$N(\text{inop} \to \text{op}) = 1/a_{oi} \tag{3.19}$$

based upon the assumption that the contingency of starting inoperative was
actually fulfilled. Similarly, the expected number of transitions for first
passage from operative to inoperative state is given by

$$N(\text{op} \to \text{inop}) = 1/a_{io} \tag{3.20}$$

based upon the assumption that the contingency of starting in the operative
state was actually fulfilled.

Thus, all of these quantities may be expressed in terms of the conditional 
probabilities of transition subject to the assumptions stated. 
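
The two-state results can be illustrated with a short Python sketch. The numerical values of the two conditional probabilities are assumptions chosen for illustration, and the function names `two_state_summary` and `iterate` are hypothetical; the sketch evaluates the closed forms (3.14) to (3.20) and compares the stationary occupancies with the result of applying the chain equations many times.

```python
def two_state_summary(a_io, a_oi):
    """Stationary quantities (3.14)-(3.20) for the operative/inoperative chain."""
    total = a_io + a_oi
    p_o = a_oi / total
    p_i = a_io / total
    return {
        "p_o": p_o,
        "p_i": p_i,
        "recurrence_op": 1.0 / p_o,
        "recurrence_inop": 1.0 / p_i,
        "first_passage_inop_to_op": 1.0 / a_oi,
        "first_passage_op_to_inop": 1.0 / a_io,
    }


def iterate(a_io, a_oi, p_o=1.0, steps=2000):
    """Apply the chain equations (3.5), (3.6) repeatedly from a chosen start."""
    p_i = 1.0 - p_o
    a_oo, a_ii = 1.0 - a_io, 1.0 - a_oi   # causality conditions (3.7), (3.8)
    for _ in range(steps):
        p_o, p_i = a_oo * p_o + a_oi * p_i, a_io * p_o + a_ii * p_i
    return p_o, p_i


# Assumed values: 1% chance of failing per transition, 20% chance of recovery.
summary = two_state_summary(a_io=0.01, a_oi=0.2)
p_o_limit, p_i_limit = iterate(a_io=0.01, a_oi=0.2)

print(summary)
print("after many transitions:", p_o_limit, p_i_limit)
```

With these assumed values the iterated occupancies settle on the same stationary values as (3.14) and (3.15), and the detailed balance relation (3.12) holds between them.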

4. FAILURE-RELATED INTERPRETATION OF TRANSITION MATRICES 

For a uniquely branched network each off-diagonal element $a_{ij|t}$ of the
transition matrix corresponds to the traversal of the j to i branch in the
specified direction. If a particular branch is deleted, BOTH terms $a_{ij|t}$ AND
$a_{ji|t}$ symmetrically located with respect to the main diagonal must be set
equal to zero. Also, if the branch connecting the ith and jth nodes fails
BIDIRECTIONALLY, the same two terms must be set equal to zero. In a network
with UNIDIRECTIONAL branches (say, j to i) only one of the two symmetrants will
be non-null and this must be set equal to zero. It is in this way that the
elements of the transition matrix are related to BRANCH FAILURES.

It is not difficult to construct matricial operators which remove elements
from a matrix. Let $E_{ii}$ denote a matrix with a unit element at i,i on the
main diagonal and zeros for all other entries. Then for a given transition
matrix A, the matrix $E_{ii} A E_{jj}$ is a matrix in which the element $a_{ij}$ is
unaffected, but all other elements are reduced to null. Hence, $A - E_{ii} A E_{jj}$
is a matrix which has identical elements as A except for $a_{ij}$, which is
replaced by zero. This matrix might reasonably be termed a BRANCH ANNIHILATOR.

The diagonal elements $a_{ii|t}$ are associated with NODAL DELAY of signal at
node i at epoch t. If there is no nodal delay this diagonal element is null
at epoch t.

What of the less commonly treated case of NODAL FAILURES? Here it becomes 
a question of what constitutes a "nodal failure". A given row of the 
transition matrix (except for the diagonal element) is associated with all 
INPUTS to the node of the same row number. A given column of the transition 
matrix is associated (except for the diagonal element) with all OUTPUTS from 
the node of the same column number. If by NODAL FAILURE is meant (1) failure 
of all outputs, or (2) failure of all inputs, or (3) failure of all outputs AND 
all inputs then all off-diagonal elements of the (1) column, or (2) row, or (3) 
column AND row with the same number as the node must be set equal to zero. In 
a more elaborate definition of nodal failure, subsets of these entities could 
be annihilated. 
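
The branch and nodal failure operations described in this section can be sketched in Python as follows. This is an illustration under stated assumptions: indices are 0-based, the 3-node matrix is an invented example, and the helper names `unit`, `matmul`, `annihilate_branch` and `fail_node` are hypothetical. The text does not say how the column sums are to be restored after an element is removed, so the sketch leaves the resulting deficit untouched.

```python
def unit(n, i):
    """E_ii: n-by-n matrix with a single unit element at (i, i)."""
    return [[1.0 if r == i and c == i else 0.0 for c in range(n)]
            for r in range(n)]


def matmul(X, Y):
    """Plain square-matrix product."""
    n = len(X)
    return [[sum(X[r][k] * Y[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]


def annihilate_branch(A, i, j):
    """A - E_ii A E_jj: identical to A except that a_ij is replaced by zero."""
    n = len(A)
    picked = matmul(matmul(unit(n, i), A), unit(n, j))
    return [[A[r][c] - picked[r][c] for c in range(n)] for r in range(n)]


def fail_node(A, k, outputs=True, inputs=True):
    """Zero the off-diagonal entries of column k (outputs) and/or row k (inputs)."""
    B = [row[:] for row in A]
    for m in range(len(A)):
        if m != k:
            if outputs:
                B[m][k] = 0.0   # column k: transitions out of node k
            if inputs:
                B[k][m] = 0.0   # row k: transitions into node k
    return B


# Assumed example matrix; A[i][j] is the transit probability from j to i.
A = [[0.5, 0.2, 0.1],
     [0.3, 0.6, 0.4],
     [0.2, 0.2, 0.5]]

one_way = annihilate_branch(A, 0, 1)            # unidirectional failure, 1 to 0
both_ways = annihilate_branch(one_way, 1, 0)    # bidirectional failure
isolated = fail_node(A, 2)                      # node 2 loses inputs and outputs

print(one_way)
print(both_ways)
print(isolated)
```

Applying the annihilator twice, once for each of the two symmetrically located elements, reproduces the bidirectional branch failure; `fail_node` covers the three nodal failure definitions through its two flags.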

5. VECTOR-MATRICIAL FORMULATION OF THE MARKOV CHAIN

The occupancy probabilities for epoch t may be conceived as components
of a STATE VECTOR $p_t$ while those at epoch $(t+\tau)$ are the components of a
state vector $p_{t+\tau}$ and these two state vectors are related by the
transition matrix $A_t$ for epoch t

$$p_{t+\tau} = A_t\, p_t \tag{5.1}$$

All of these occupancy vectors are in the first n-tant since all components of
the (n-dimensional) vectors are non-negative. The vector symmetrically
directed with respect to the n orthogonal axes has a transpose (row vector)
given by

$$u^T = (1, 1, 1, \ldots, 1, 1) \tag{5.2}$$

with all components defined to be unity and T denoting transpose. The
normalization of occupancy probabilities then requires that

$$u^T \cdot p_t = 1 \tag{5.3}$$

for all epochs t. A state vector of equal likelihood each of whose components
is 1/n may also be constructed. It should be clearly recognized that the
above condition does not ensure that the state vectors retain the same
magnitude after transition as they had before transition. Each transition has
the potentiality of changing both direction and magnitude of the occupancy
state vector since according to (5.3), it must terminate on a hyperplane
orthogonal to the state vector of equal likelihood both before and after
transition. The only other restriction is the requirement that the state
vectors lie in the first n-tant where all their components will be
non-negative. For a large number n of states the "angular separation" $\theta$ of
a particular state vector from the state vector of equal likelihood is easily
estimated to be

$$\theta = \arccos\left[ \left( n \sum_k p_k^2 \right)^{-1/2} \right] \tag{5.4}$$

The maximum possible angular separation between a state vector and the equal
likelihood state vector is given by $\theta = \arccos(1/n^{1/2})$.
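
As a small numerical illustration of (5.4), the following Python sketch evaluates the angular separation for two assumed 4-state occupancy vectors: the vector of equal likelihood and a vector certain to be in one state. The function name `angular_separation` is hypothetical.

```python
import math


def angular_separation(p):
    """Angle (radians) between p and the equal-likelihood vector, per (5.4)."""
    n = len(p)
    cosine = (n * sum(x * x for x in p)) ** -0.5
    return math.acos(min(1.0, cosine))   # clamp guards against rounding above 1


uniform = [0.25, 0.25, 0.25, 0.25]       # state vector of equal likelihood
certain = [1.0, 0.0, 0.0, 0.0]           # system certain to be in state 1

print(angular_separation(uniform))       # no separation from itself
print(angular_separation(certain))       # the maximum, arccos(1/sqrt(n))
```

The second value equals arccos(1/2) for n = 4, in agreement with the stated maximum.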

6. LINEAR MAPPING OF GRID NETWORK 

A GRID NETWORK is a network with nodes at all lattice points of a 
rectangular lattice with branches vertical or horizontal connecting these 
points and no others. The occupancy probabilities for the network nodes are 
the components of the STATE VECTOR in the Markov chain model of the system 
describing the progress of signal or fault through the network. Therefore, the 
state of the system is specified in terms of a one-dimensional array of nodes. 
In terms of sequential occupation of nodes in an actual two-dimensional 
network, it is more convenient to specify the nodes as a two-dimensional array. 
Without specifying the geometrical array of nodes, the sequential occupation of 
states in the Markov model will not have a unique relationship to the 
occupation of nodes in a given two-dimensional array. This comes about because 
there does not exist an a priori unique (mapping) correspondence between arrays 
of different dimensionality. 

In particular it is necessary to specify the correspondence between a
rectangular grid of nodes at (i,j) with (i=1,2,...,m), (j=1,2,...,n) and a
linear array with (k=1,2,...,mn). Some possible ways of constructing such a
correspondence are illustrated in Fig. 1 below:




Fig. 1 Grid/Linear Array Mappings 

The Markov model has no a priori explicit cognizance of the way the
two-dimensional array is formed. Generally, it will be most convenient to
specify the Markovian sequence of states by the (boustrophedon) path of (a),
for which with n columns and m rows the original (one-dim array) Markov
nodal number k is given in terms of the (i,j) mapped nodal coordinate by

$$k = ni - (-1)^i j - (n-1)/2 + (-1)^i (n+1)/2 \tag{6.1}$$

for (i=1,2,...,m) and (j=1,2,...,n). The inverse mapping yielding (i,j) for a
given value of k is found as follows. The row number i is given by

$$i = [(k-1)/n] + 1 \tag{6.2}$$

where [] means "integer part of". Then j is given by

$$j = (-1)^i \left( ni - k - (n-1)/2 \right) + (n+1)/2 \tag{6.3}$$


In this way (with such a specified path) the nodal occupancy probabilities $p_k$
may be replaced by $p_{ij}$ where the location of the node (i,j) is specified in a
two-dimensional grid. If $p_{ij|t}$ is the a priori absolute probability of
occupancy of the node at (i,j) at epoch t, then one can introduce

${}_{uv}a_{ij|t}$ = conditional probability of transit from (i,j) to (u,v) during
the time interval $(t, t+\tau)$

The Markov chain equation then becomes

$$p_{uv|t+\tau} = \sum_{i,j} {}_{uv}a_{ij|t}\, p_{ij|t} \tag{6.4}$$

with

$$1 = \sum_{i,j} p_{ij|t} \tag{6.5}$$

for occupancy normalization and

$$1 = \sum_{u,v} {}_{uv}a_{ij|t} \tag{6.6}$$

as causality principle.

The same equations can be more concisely represented by introducing the
GAUSSIAN (complex) integers defined by

$$g = i + j\sqrt{-1}, \quad 1 \le i \le m, \quad 1 \le j \le n$$

$$f = i' + j'\sqrt{-1}, \quad 1 \le i' \le m, \quad 1 \le j' \le n$$

Then one has

$$p_{f|t+\tau} = \sum_g {}_{f}a_{g|t}\, p_{g|t} \tag{6.7}$$

with occupancy normalization

$$1 = \sum_g p_{g|t} \tag{6.8}$$

and causality principle

$$1 = \sum_f {}_{f}a_{g|t} \tag{6.9}$$

with all quantities having complex indices.

If it is desired to restrict to "nearest neighbor transitions" so that
(i,j) to (i+1,j), (i-1,j), (i,j+1), (i,j-1) are the only transitions from
(i,j) with non-null conditional probabilities of transition one can find these
transitions in terms of the original k sequence. Thus, the transitions
considered in terms of k become

$$(i+1,j):\quad k = n(i+1) + (-1)^i j - (n-1)/2 - (-1)^i (n+1)/2 \tag{6.10}$$

$$(i-1,j):\quad k = n(i-1) + (-1)^i j - (n-1)/2 - (-1)^i (n+1)/2 \tag{6.11}$$

$$(i,j+1):\quad k = ni - (-1)^i (j+1) - (n-1)/2 + (-1)^i (n+1)/2 \tag{6.12}$$

$$(i,j-1):\quad k = ni - (-1)^i (j-1) - (n-1)/2 + (-1)^i (n+1)/2 \tag{6.13}$$

for the two-dimensional post-transition states indicated.
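
The boustrophedon mapping (6.1), its inverse (6.2), (6.3) and the nearest neighbor numbers (6.10) to (6.13) can be sketched in Python as below. Indices are 1-based as in the text; the function names `to_linear`, `to_grid` and `neighbours` are hypothetical, and the grid size used in the example is an assumption. Each formula is doubled before the integer division so that the half-integer terms never produce fractions.

```python
def to_linear(i, j, n):
    """Markov nodal number k for grid node (i, j) with n columns, per (6.1)."""
    s = (-1) ** i
    return (2 * n * i - 2 * s * j - (n - 1) + s * (n + 1)) // 2


def to_grid(k, n):
    """Inverse mapping (6.2), (6.3): grid node (i, j) for nodal number k."""
    i = (k - 1) // n + 1
    s = (-1) ** i
    j = (s * (2 * n * i - 2 * k - (n - 1)) + (n + 1)) // 2
    return i, j


def neighbours(i, j, m, n):
    """Nodal numbers of the nearest neighbors of (i, j) that lie on the grid."""
    candidates = [(i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1)]
    return [to_linear(u, v, n) for (u, v) in candidates
            if 1 <= u <= m and 1 <= v <= n]


m, n = 3, 4   # assumed grid: 3 rows, 4 columns
for i in range(1, m + 1):
    print([to_linear(i, j, n) for j in range(1, n + 1)])
print(neighbours(1, 1, m, n))
```

Odd rows are numbered left to right and even rows right to left, so consecutive values of k are always adjacent on the grid, while vertical neighbors generally have nodal numbers that are far apart.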


7. THE DIFFUSION AND PROPAGATION OF FAULTS OR SIGNALS IN A NETWORK 


It will now be shown under what conditions it is possible to have a 
diffusive or wavelike propagation of successive faults or signals in a
network. In order to get such a propagation in a Markov model, it is necessary
to impose a SELECTION RULE restricting to transitions between nearest neighbors 
and it is important to distinguish between chains obeying (3.2), (3.3) or 
(6.6), (6.8). Both are Markov models but the nearest neighbors are different 
in each. Propagation in the chain of Section 3 means propagation through a 
linear array of states while propagation in the chain of Section 6 means 
propagation through a two-dimensionally ordered set of states. Generally, it 
is not possible to get a wavelike propagation through the states in either case 
without imposing some restrictions on the transition probabilities of the 
general Markov chains of either section. 

For the case of a transition matrix independent of time the conditions for 
wave-like propagation can be readily adduced. 

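The chain equation (3.2) and the causality principle (3.3) by imposition of the
selection rule$^{5}$

$$|i - j| > 1 \ \text{implies}\ a_{ij} = 0 \tag{7.1}$$

become (writing $a_{i|j}$ for the conditional probability of transit from j to i)

$$p_{i|t+\tau} = a_{i|i+1}\, p_{i+1|t} + a_{i|i-1}\, p_{i-1|t} + a_{i|i}\, p_{i|t} \tag{7.2}$$

and

$$a_{i|i} + a_{i+1|i} + a_{i-1|i} = 1 \tag{7.3}$$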


The selection rule simply excludes transitions except among nearest neighbors.
If h is the mean number of states through which a fault propagates during
transition time $\tau$, one can then define the quantities $D_i$, $w_i$, $\mu_i$ as
follows:

$$2\tau D_i = h^2 \left( a_{i|i+1} + a_{i|i-1} \right) \tag{7.4}$$

$$\tau w_i = h \left( a_{i|i+1} - a_{i|i-1} \right) \tag{7.5}$$

$$\tau \mu_i = \left( a_{i+1|i} - a_{i|i+1} \right) + \left( a_{i-1|i} - a_{i|i-1} \right) \tag{7.6}$$


Using the so defined quantities and (7.3), the restricted chain equation (7.2)
can be written in the form

$$(p_{i|t+\tau} - p_{i|t})/\tau = D_i\,(p_{i+1|t} - 2p_{i|t} + p_{i-1|t})/h^2 + w_i\,(p_{i+1|t} - p_{i-1|t})/2h - \mu_i\, p_{i|t} \tag{7.7}$$

which is a finite approximant of the diffusion equation with drift w and rate
of destruction $\mu$ (supposing $(a_{i+1|i} + a_{i-1|i}) > (a_{i|i+1} + a_{i|i-1})$)

$$\partial_t p = D\, \partial_x^2 p + w\, \partial_x p - \mu p \tag{7.8}$$

with diffusion coefficient D. This indicates that with transitions restricted
to nearest neighbors faults may diffuse through the states.
$\mu$ may also be replaced by $-\mu$ provided only that
$(a_{i|i+1} + a_{i|i-1}) > (a_{i+1|i} + a_{i-1|i})$. Thus the expression given
for $\mu$ functions as a fault annihilator or creator.
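
The regrouping of the restricted chain equation (7.2) into the finite-difference form (7.7) can be verified numerically. The Python sketch below is an illustration under stated assumptions: the 4-state tridiagonal matrix and occupancy vector are invented values with columns summing to unity, indices are 0-based, h and the transition time are both set to 1, and the function names `coefficients`, `rhs_7_7` and `direct_7_2` are hypothetical.

```python
def coefficients(A, i, h=1.0, tau=1.0):
    """D_i, w_i, mu_i of (7.4)-(7.6) for an interior state i."""
    into_from_up, into_from_down = A[i][i + 1], A[i][i - 1]    # a_{i|i+1}, a_{i|i-1}
    out_to_up, out_to_down = A[i + 1][i], A[i - 1][i]          # a_{i+1|i}, a_{i-1|i}
    D = h * h * (into_from_up + into_from_down) / (2.0 * tau)
    w = h * (into_from_up - into_from_down) / tau
    mu = ((out_to_up - into_from_up) + (out_to_down - into_from_down)) / tau
    return D, w, mu


def rhs_7_7(A, p, i, h=1.0, tau=1.0):
    """Right side of (7.7): diffusion, drift and destruction terms."""
    D, w, mu = coefficients(A, i, h, tau)
    return (D * (p[i + 1] - 2.0 * p[i] + p[i - 1]) / h ** 2
            + w * (p[i + 1] - p[i - 1]) / (2.0 * h)
            - mu * p[i])


def direct_7_2(A, p, i):
    """Right side of (7.2): nearest-neighbor chain equation."""
    return A[i][i + 1] * p[i + 1] + A[i][i - 1] * p[i - 1] + A[i][i] * p[i]


# Assumed tridiagonal matrix; A[i][j] is the transit probability from j to i.
A = [[0.7, 0.1, 0.0, 0.0],
     [0.3, 0.5, 0.2, 0.0],
     [0.0, 0.4, 0.6, 0.5],
     [0.0, 0.0, 0.2, 0.5]]
p = [0.1, 0.4, 0.3, 0.2]

for i in (1, 2):   # interior states only
    print(i, coefficients(A, i), p[i] + rhs_7_7(A, p, i), direct_7_2(A, p, i))
```

For each interior state the occupancy obtained from (7.7) coincides with the one obtained directly from (7.2), which confirms that (7.4) to (7.6) are a regrouping of the same conditional probabilities and not an additional approximation.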

For the case where 


h<< * a il i -i ) 


(7.9) 


the diffusive term will become negligible with respect to the drift term and
the fault will propagate through the network in a wave-like fashion provided
only that $\mu_i$ is zero. The phase velocity is naturally

If the individual states are arranged in a two-dimensional array rather 
than the one-dimensional linear array above, a regrouping of conditional 
probabilities to give a "diffusive case" or possibly a wave-like case can still 
be attained by imposing a selection rule on the transitions. However, it is 
most important to realize that in such a development the definition of "nearest 
neighbors" has changed and the analysis must take this into account. 

For the case where the transition matrix is a function of time, it is more 
convenient to return to the vector-matricial model of Section 5. Put the case 
that at some time the state vector from some index on has only null components 
(unoccupied states). The question is then posed as to what conditions the 
transition matrix must fulfill in order to advance the occupancy state by 
contiguous state as the transitions occur. If the index (component number) 
from which all previous components are not necessarily zero is q, then the qth 
and all later components are taken to be zero. In order that the transition 
matrix now accomplish the extension of occupancy to qth component of the state 
vector BUT NOT BEYOND it will be sufficient if all elements of the transition 
matrix with row numbers greater than q and column numbers less than q be 
null. Thus, it is readily grasped that not only is a wave of replacement of 
zeros propagating in the state vector but also a wave of zero replacements is 
simultaneously occurring in the transition matrix. Hence, it is seen that with 
the fulfillment of these conditions faults can propagate in a wave-like 
fashion, even in the case where the transition matrix is time dependent. An 
example of such a propagation is given below with the convention that 1 does
not represent the unit but rather any non-null element. Then schematically one
has (Fig. 2)

which indicates in a graphic way what is meant by "propagation" through the
transition matrix simultaneous with the propagation through the states of the
state vector. The propagation is that of a partition between null and non-null
states and null and non-null transitions. Fig. 2 then corresponds to the
propagation of nulls in the state vector with $p_k = 0$ for $k > m-1$ and in the
transition matrix with $a_{jk} = 0$ for $j > m > k$.

From this it is seen that the transitions must have a very particular type 
of time dependence (inhomogeneity) in order for propagation as such to occur. 
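
The time-dependent propagation condition can be illustrated with a small Python sketch. Everything numerical here is an assumption made for illustration: the starting matrix is dense and uniform, the elements with row number beyond the front and column number behind it are set to zero at each epoch, and the affected columns are then rescaled so that they again sum to unity (the rescaling is a choice of this sketch, not something the text prescribes). Indices are 0-based and the names `masked_matrix` and `propagate` are hypothetical.

```python
def masked_matrix(n, front):
    """Transition matrix for an epoch at which states 0..front-1 are occupied.

    Elements with row > front and column < front are null, so occupancy can
    reach state `front` but cannot pass beyond it in this transition.
    """
    A = [[1.0 / n] * n for _ in range(n)]
    for c in range(front):
        for r in range(front + 1, n):
            A[r][c] = 0.0
        col_sum = sum(A[r][c] for r in range(n))
        for r in range(n):
            A[r][c] /= col_sum        # assumed renormalization of the column
    return A


def propagate(n, steps):
    """Occupancy vectors for successive epochs, starting with state 0 occupied."""
    p = [1.0] + [0.0] * (n - 1)
    front = 1
    history = [p]
    for _ in range(steps):
        A = masked_matrix(n, front)
        p = [sum(A[r][c] * p[c] for c in range(n)) for r in range(n)]
        history.append(p)
        front = min(front + 1, n)     # the partition advances by one state
    return history


for epoch, vector in enumerate(propagate(5, 3)):
    print(epoch, vector)
```

At each epoch exactly one further component of the state vector becomes non-null, while the block of zeros in the transition matrix recedes by one row and column, which is the simultaneous propagation described in the text.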

Regardless of whether it occurs or not, one can form useful estimates of
the concentrative or dispersive effect of each transition by calculating the
expected state and expected standard deviation in states after each transition.
Thus,

$$\langle k \rangle = \sum_k k\, p_{k|t} \tag{7.10}$$

$$\sigma_k^2 = \sum_k k^2\, p_{k|t} - \langle k \rangle^2 \tag{7.11}$$

both of which are quite naturally time dependent. From the view point of the 
transition matrix to effect a concentration of the occupancies in the state 
vector on any particular transition the row vectors which form the transition 
matrix must be close to orthogonal to the state vector on which they operate 
except in a narrow range of row numbers (in the extreme case 1). On the 
contrary, if the transition matrix is to effect an equalization of the 
components of the state vector then the row vectors should all have the same 
scalar product with the state vector on which they operate. In either case 


(7.10) and (7.11) describe quantitatively the distribution of occupancy in
states. Another measure of how uniformly (or non-uniformly) states are
distributed in the state vector is the Entropy defined by $-\sum_k p_k \ln p_k$ which
takes the value $\ln n$ for the state vector of equal likelihood and the value
0 for the state vector of a system stochastically certain to be in a particular
state.
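
The three occupancy measures (7.10), (7.11) and the entropy can be computed together, as in the following Python sketch. States are numbered from 1 as in the text; the function name `occupancy_statistics` and the two example vectors are assumptions for illustration. Null components are skipped in the entropy sum, using the usual convention that they contribute nothing.

```python
import math


def occupancy_statistics(p):
    """Expected state (7.10), variance (7.11) and entropy of an occupancy vector."""
    states = range(1, len(p) + 1)
    mean = sum(k * pk for k, pk in zip(states, p))
    variance = sum(k * k * pk for k, pk in zip(states, p)) - mean ** 2
    entropy = -sum(pk * math.log(pk) for pk in p if pk > 0.0)
    return mean, variance, entropy


print(occupancy_statistics([0.25, 0.25, 0.25, 0.25]))   # equal likelihood
print(occupancy_statistics([0.0, 1.0, 0.0, 0.0]))       # certain to be in state 2
```

The equal likelihood vector gives the entropy ln 4, while the vector certain to be in one state gives zero variance and zero entropy.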

8. SEMI-MARKOV GENERALIZATION OF MARKOV CHAINS

It has been pointed out in Section 2 that delay of faults and signals 
could under certain circumstances be treated by nodal insertions in the context 
of a Markov model. This requires delay times which are multiples of a common 
(constant) transition time. There is another method which is suited to 
continuous stochastically variable delay times. This is the method of Semi- 
Markov Chains. They are constructed around an "embedded" Markov Chain which 
may be taken to have a time independent transition matrix. 

$\tau$ is now taken to be a continuous stochastic transition time and the
following definitions apply:

$a_{ij}$ = conditional probability of transition from j to i CONTINGENT upon
the system having been in j (i.e., upon j having been occupied before the
transition).

$F_{ij}(\tau)$ = conditional probability of transition from j to i in a time
interval less than $\tau$ CONTINGENT upon the transition from j to i having
occurred.

$a_i(\tau)$ = conditional probability of node i being occupied in time interval
less than $\tau$ CONTINGENT upon a transition from some node to i having
occurred.

With these definitions we obtain the SEMI-MARKOV chain equation

$$a_i(\tau) = \sum_j a_{ij}\, F_{ij}(\tau) \tag{8.1}$$

and the principle of causality

$$1 = \sum_i a_{ij} \quad \text{with}\ a_{ij} \ge 0 \tag{8.2}$$


If $\tau$ is allowed to become infinite then $F_{ij}$ and $a_i$ should both become
unity. However,

$$F_{ij}(\infty) = 1 \ \text{implies}\ a_i(\infty) = 1 \ \text{only if}\ \sum_j a_{ij} = 1 \tag{8.3}$$

Hence, the transition matrix must be doubly stochastic. It is parenthetically
noted that since reconfigurations which "restore" the condition of the network
in some sense are being considered, this may very well be appropriate for the
cases at hand. In any case

$$F_{ij}(0) = 0 \ \text{implies}\ a_i(0) = 0 \tag{8.4}$$

so that no instantaneous transitions are allowed.

The (Stieltjes) differential of both sides of (8.1) is then

$$da_i(\tau) = \sum_j a_{ij}\, dF_{ij}(\tau) \tag{8.5}$$


and both differentials are clearly non-negative. The normalization consistent
with (8.3) is

$$\int_0^\infty da_i(\tau) = 1 = \int_0^\infty dF_{ij}(\tau) \tag{8.6}$$

so that the mean time interval to occupy the ith node (during the transition)
is

$$\tau_i = \int_0^\infty \tau\, da_i(\tau) \tag{8.7}$$

and the mean transition time for the j to i (nodal) transition is defined
to be

$$\tau_{ij} = \int_0^\infty \tau\, dF_{ij}(\tau) \tag{8.8}$$

so that the conclusion

$$\tau_i = \sum_j a_{ij}\, \tau_{ij} \tag{8.9}$$

implies that the mean time interval required to occupy node i is a weighted
average of the mean transition times into the node which is a consequence of
the double stochasticity of the (embedded) transition matrix. The mean time
interval to occupy all n states of the Semi-Markov chain is given by

$$\langle \tau \rangle = (1/n) \sum_i \tau_i \tag{8.10}$$
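
A small numerical illustration of (8.1), (8.9) and (8.10) is given by the following Python sketch. It rests on assumptions that are not in the text: the embedded matrix is an invented doubly stochastic 3-state example, and each transition time distribution is taken to be exponential, so that its mean transition time is the reciprocal of an assumed rate. The names `occupancy_cdf` and `mean_occupancy_time` are hypothetical.

```python
import math

# Assumed doubly stochastic embedded matrix: every row AND column sums to 1.
a = [[0.5, 0.3, 0.2],
     [0.2, 0.5, 0.3],
     [0.3, 0.2, 0.5]]

# Assumed exponential rates, so F_ij(tau) = 1 - exp(-rate_ij * tau)
# and the mean transition time of (8.8) is 1 / rate_ij.
rate = [[1.0, 2.0, 4.0],
        [2.0, 1.0, 2.0],
        [4.0, 2.0, 1.0]]

n = len(a)


def occupancy_cdf(i, tau):
    """a_i(tau) of (8.1): sum_j a_ij * F_ij(tau)."""
    return sum(a[i][j] * (1.0 - math.exp(-rate[i][j] * tau)) for j in range(n))


def mean_occupancy_time(i):
    """tau_i of (8.9): weighted average of the mean transition times into i."""
    return sum(a[i][j] / rate[i][j] for j in range(n))


times = [mean_occupancy_time(i) for i in range(n)]
overall = sum(times) / n                      # (8.10)

print([occupancy_cdf(i, 0.0) for i in range(n)])
print([occupancy_cdf(i, 1000.0) for i in range(n)])
print(times, overall)
```

Because the rows of the assumed matrix sum to unity, each occupancy function rises from zero to one, as (8.3) and (8.4) require; with a matrix that is only column stochastic the limiting values would differ from one.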

It should be noted that the entire formulation of the Semi-Markov chain is in
terms of conditional probabilities. If it were desired to generalize the chain
equations (3.2) involving absolute probabilities of occupancy one should have

$$p_{i|t+\tau} = \sum_j a_{ij}\, F_{ij}(\tau)\, p_{j|t} \tag{8.11}$$

with a clear understanding of the difference between universal clock time
(epoch) t and stochastic transition time interval $\tau$. The difficulty with
such an extension is that even if the occupancy probabilities are initially
referenced to clock time they become functions of the stochastic transition
times after any transitions, introducing numerous new variables into the
problem.


9. STATISTICAL INDEPENDENCE OF STATES 

It is a tacit assumption of the Markov chain concept that the states must
be defined so that they can be occupied independently and the same requirement
applies in principle to the semi-Markov chain which contains an embedded Markov
chain as part of its structure. In the semi-Markov chain the situation is even
more severe with a sparse transition matrix because the consistent calculation
of mean transition times requires that the transition matrix be doubly
stochastic. The models of CARE III, SURE, HARP, etc., seem to overlook this fact
and are therefore dealing with state definitions which are NOT INDEPENDENT.
Because of this they should not be referred to as semi-Markov systems. This
remark does not of itself invalidate the calculation of path transit
probabilities made in those systems either in the time domain$^{7}$ or in the
frequency domain$^{8}$.

10. COMMENTS ON VOTER SYSTEMS 

The voter system of n elements yields "agreement" for k failures among
the n elements provided n-k > [n/2] ([] means integer part of). Otherwise
the voter system yields "disagreement". It is a majority rule system.

The individual voter elements are however subject to malfunction; hence the
choice between operative and inoperative for the system as a whole can
occasionally occur without reference to input. This would be the case of the
"irrational voter" whose choices are entirely random. The probability $p_a$ of
agreement on such a random basis is

$$p_a = \sum_{k=0}^{[n/2]} C^n_k\, p^{n-k} q^k \tag{10.1}$$

where $C^n_k$ is the (binomial coefficient) number of combinations of n things
taken k at a time, p is the probability of a YES vote by an individual
element, q is the probability of a NO vote by an individual element. Thus
$p_a$ might be called the "probability of irrational agreement" (e.g., an
agreement to go to war when it serves no known national interest). In terms of
the expected number of agreements $N_a = 1/p_a$, expected number of YES votes
$N_Y = 1/p$ and expected number of NO votes $N_N = 1/q$, one has from (10.1)

$$1/N_a = \sum_{k=0}^{[n/2]} C^n_k\, N_Y^{-(n-k)} N_N^{-k} \tag{10.2}$$
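
The binomial sum (10.1) is simple to evaluate directly. The Python sketch below does so under two stated assumptions: the votes of the elements are independent, and q = 1 - p (each element votes either YES or NO). The function names are hypothetical, and the example uses an odd number of elements, for which the upper limit [n/2] of the sum coincides with the majority condition n-k > [n/2].

```python
from math import comb


def irrational_agreement(n, p):
    """p_a of (10.1): sum over k = 0..[n/2] of C(n, k) p^(n-k) q^k, with q = 1 - p."""
    q = 1.0 - p
    return sum(comb(n, k) * p ** (n - k) * q ** k for k in range(n // 2 + 1))


def expected_agreements(n, p):
    """N_a = 1 / p_a, as defined in the text."""
    return 1.0 / irrational_agreement(n, p)


for p in (0.5, 0.9):
    print(p, irrational_agreement(3, p), expected_agreements(3, p))
```

For three elements voting YES with probability one half, the random agreement probability is one half, which shows how large the irrational contribution can be when it is not subtracted.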


The probability of agreement based upon rational factors is undoubtedly
not binomial. Since the elements of the voter system are superficially
identical, it seems they could be reasonably assumed to be equicorrelated
because of their common function but hardly independent. Their common design
could apparently yield a correlative bias in performance. Thus, from the total
expected number of agreements of the voter system should be subtracted the
expected number of irrational agreements given by (10.2) to arrive at the
expected number of rational agreements (i.e., agreements arrived at solely by
mutual consideration of inputs). In future work modeling the correlation
between individual voter elements should be of considerable importance.

11. REDUNDANCY IN NETWORKS

The principal device used to increase reliability of networks is BRANCH
REDUNDANCY in which a branch with probability q of failure by itself is
replaced by n parallel branches with probability of failure $q^n$ (on the
assumption that the parallel branches fail independently). From this it can be
readily calculated that the number of branches required to reduce the
probability of the failure of the multi-branched system to $10^{-m}$ is simply

$$n = [\, m / \log_{10}(1/q) \,] \tag{11.1}$$

where the brackets here denote rounding up to the next integer (as the entry 5
for q = .01, m = 9 in the table shows). From (11.1) a small table may be
constructed with the values of n in the body of the table, the values of q as
vertically arrayed entries and the values of m as horizontally arrayed entries.

Table 3

  q \ m      6     9    12
  .1         6     9    12
  .01        3     5     6
  .001       2     3     4

Branches with a greater probability of failure are also easily calculated from
(11.1).

The calculation of failure probability of parallel multibranches is
accomplished by successive application of the calculation for two branches, say
1 and 2, in parallel. Then the probability of failure of the double branch is
simply $q_1 q_2$ where the "q"s are the probabilities of failure of individual
branches.
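
The entries of Table 3 and the failure probability of a multibranch can be reproduced with a few lines of Python. This is a sketch under stated assumptions: the bracket in (11.1) is read as rounding up, which is what the tabulated value 5 for q = .01 and m = 9 implies; the branches fail independently; and the function names `branches_needed` and `parallel_failure` are hypothetical.

```python
import math


def branches_needed(q, m):
    """Smallest n with q**n <= 10**(-m), per (11.1) read as rounding up."""
    exact = m / math.log10(1.0 / q)
    # The small offset keeps an exact integer result from being pushed up
    # to the next integer by floating-point rounding.
    return math.ceil(exact - 1e-9)


def parallel_failure(qs):
    """Failure probability of independent parallel branches: the product of the q's."""
    return math.prod(qs)


table = {q: [branches_needed(q, m) for m in (6, 9, 12)]
         for q in (0.1, 0.01, 0.001)}
for q, row in table.items():
    print(q, row)
print(parallel_failure([0.1, 0.2]))
```

Repeated application of the two-branch product gives the n-branch value, so `parallel_failure` also accepts branches with unequal failure probabilities.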

At this stage it is easy to determine the effect on the transition matrix
of the uniquely branched network. Corresponding to the branch $a_{ij}$ this
conditional probability must be replaced by the probability of the
multi-branch.

The question of NODAL REDUNDANCY would seem to imply replacing a single 
node by n nodes but this cannot be done without simultaneously multiplying 
all inputs and outputs for the node which considerably complicates the network. 
Apparently the use of a voter system is another way of handling the nodal 
redundancy problem. In that case the node complete with its treatment of 
inputs is replaced by a "new kind of node" capable of making its own decisions 
about how to treat inputs. 

12. NON-STATIONARY FAULT ARRIVAL RATE THEORY

In view of the importance of fault arrival rates it seems worthwhile to 
attempt to construct a theory to handle this parameter under non-stationary 
conditions. As a first approximation this will be based upon two assumptions: 
Assumption 1: The ratio R of reconfiguration rate to fault arrival rate u
is a constant.

Assumption 2: The ratio $\epsilon$ of the absolute probability of the transition from
operative system state to inoperative system state to the absolute probability
of the transition from inoperative system state to operative system state is a
constant.

Two ways of the system becoming inoperative contingent upon its having
been operative will be recognized. The system may become inoperative due to
internal malfunction quite independently of fault arrival or it may become
inoperative due to fault arrival. In the latter case it will be reasonable to
expect the effect to be proportional to the fault arrival rate u. The
following definitions apply:

$b_{io}\, u$ = conditional probability of system becoming inoperative due to 
fault arrival, contingent upon having been operative 

$c_{io}$ = conditional probability of system spontaneously becoming 
inoperative, contingent upon having been operative 

$c_{oi}$ = conditional probability of system spontaneously becoming 
operative, contingent upon having been inoperative 

$b_{oi}\, R\, u$ = conditional probability of system becoming operative (due to 
reconfiguration capability), contingent upon having been inoperative 

$a_{oi}\, p_i$ = absolute probability of system becoming operative from 
inoperative 

$a_{io}\, p_o$ = absolute probability of system becoming inoperative from 
operative 

The term "astationarity parameter" will be used for $\varepsilon$. Only for 
$\varepsilon = 1$ does stationarity obtain. The principle of astationarity 

$$ a_{io}\, p_o = \varepsilon\, a_{oi}\, p_i \qquad (12.1) $$

now replaces the stationarity condition. The conditional probabilities of 
transition may now be expressed in terms of the definitions above: 

$$ a_{io} = b_{io}\, u + c_{io} \qquad (12.2) $$

$$ a_{oi} = b_{oi}\, R\, u + c_{oi} \qquad (12.3) $$

Substituting these into (12.1) yields 

$$ (b_{io}\, u + c_{io})\, p_o = \varepsilon\, (b_{oi}\, R\, u + c_{oi})\, p_i \qquad (12.4) $$

Solving this for u then yields 

$$ u = \frac{\varepsilon\, c_{oi}\, p_i - c_{io}\, p_o}{b_{io}\, p_o - \varepsilon\, R\, b_{oi}\, p_i} \qquad (12.5) $$
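
The step from (12.4) to (12.5) can be checked numerically. The following is a minimal sketch, reading the subscripts as i (inoperative) and o (operative) and the astationarity parameter as eps. The function names and all numerical values are illustrative assumptions and are not taken from the report.

```python
def fault_arrival_rate(eps, R, b_io, b_oi, c_io, c_oi, p_o, p_i):
    """Fault arrival rate u according to (12.5)."""
    denominator = b_io * p_o - eps * R * b_oi * p_i
    if denominator == 0.0:
        raise ZeroDivisionError("(12.5) is singular for these parameters")
    return (eps * c_oi * p_i - c_io * p_o) / denominator


def astationarity_residual(u, eps, R, b_io, b_oi, c_io, c_oi, p_o, p_i):
    """Left side minus right side of (12.4); zero when u solves it."""
    a_io = b_io * u + c_io        # (12.2)
    a_oi = b_oi * R * u + c_oi    # (12.3)
    return a_io * p_o - eps * a_oi * p_i


# Illustrative values only (assumed, not data from the report).
params = dict(eps=1.2, R=5.0, b_io=0.8, b_oi=0.1,
              c_io=0.001, c_oi=0.05, p_o=0.9, p_i=0.1)

u = fault_arrival_rate(**params)
residual = astationarity_residual(u, **params)
print("u =", u, " residual of (12.4) =", residual)
```

With these assumed values the rate is positive and the residual of (12.4) vanishes to rounding error. Other choices can make the numerator or denominator of (12.5) change sign, so a fitted u should be checked for physical plausibility.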


Suppose now that an Ansatz such as 

$$ p_i = \int_0^t dF(t) \qquad (12.6) $$

$$ p_o = \int_t^\infty dF(t) \qquad (12.7) $$

with 

$$ \int_0^\infty dF(t) = 1 \qquad (12.8) $$

is used, so that the probability of being inoperative initially is taken to be 
zero, as is the probability of being operative ultimately. It should be noted 
that F(t) is not a distribution function; otherwise the occupancy probabilities 
would be constrained to be monotonic. In any case the fault arrival rate becomes 


$$ u = \frac{\varepsilon\, c_{oi} \int_0^t dF(t) - c_{io} \int_t^\infty dF(t)}{b_{io} \int_t^\infty dF(t) - \varepsilon\, R\, b_{oi} \int_0^t dF(t)} \qquad (12.9) $$

With more special assumptions about the occupancy probabilities other forms of 
(12.5) become possible. If it were assumed that the probability of being 
inoperative relative to that of being operative became exponentially unlikely 
with increasing time, (12.5) would become 

$$ u = \frac{\varepsilon\, K\, t}{b_{io}\, e^{\lambda t} - \varepsilon\, R\, b_{oi}} \qquad (12.10) $$

under the assumptions $c_{io} = 0$, $p_o = 1/(1 + e^{-\lambda t})$ and 
$p_i = e^{-\lambda t}/(1 + e^{-\lambda t})$, so that $p_i/p_o = e^{-\lambda t}$, 
and $c_{oi} = Kt$. This form may be fitted to data if the coefficients are 
constant. 
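
As a check on the special case, the sketch below evaluates the general expression (12.5) with the exponential occupancy probabilities and compares it with the closed form (12.10). The symbol lam stands for the decay constant in the exponent. The function names and parameter values are assumptions chosen only for illustration.

```python
import math


def u_general(eps, R, b_io, b_oi, c_io, c_oi, p_o, p_i):
    """Fault arrival rate from (12.5)."""
    return (eps * c_oi * p_i - c_io * p_o) / (b_io * p_o - eps * R * b_oi * p_i)


def u_exponential(t, eps, R, b_io, b_oi, K, lam):
    """Closed form (12.10), valid for c_io = 0 and c_oi = K*t."""
    return eps * K * t / (b_io * math.exp(lam * t) - eps * R * b_oi)


# Illustrative coefficients (assumed, not fitted to any data).
eps, R, b_io, b_oi, K, lam = 1.2, 5.0, 0.8, 0.1, 0.02, 0.5

max_diff = 0.0
for t in (0.5, 1.0, 2.0, 4.0):
    p_o = 1.0 / (1.0 + math.exp(-lam * t))
    p_i = math.exp(-lam * t) / (1.0 + math.exp(-lam * t))
    general = u_general(eps, R, b_io, b_oi, 0.0, K * t, p_o, p_i)
    special = u_exponential(t, eps, R, b_io, b_oi, K, lam)
    max_diff = max(max_diff, abs(general - special))
    print(t, special)
```

The two expressions agree to rounding error at every sampled time, which supports reading (12.10) as (12.5) divided through by the probability of being inoperative.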

Finally, some remarks about fault arrival should be made. In hardware, 
faults do not arrive at failure states; they arrive at devices. In software, 
faults do not arrive at failure states; they arrive at nodes in flow charts. 

13. TRANSITION MATRIX CHARACTERIZATION FOR SOFTWARE ERRORS 

The principal problem of reliability for software appears to be the 
masking of errors concealed in a node of the flow chart which is not invoked 
during a particular sequence of runs. The basic requirement is then a way of 
comparing the system performance with utilization of this node versus the 
system performance in the avoidance of this node. As far as the transition 
matrix is concerned, removal of this node is equivalent to removing the row and 
column containing the node from the original transition matrix. Then, using the 
two transition matrices, one would calculate the probabilities of attaining the 
same end states (final instructions) for each of the matrices. The ratio of 
these probabilities would then yield a measure of the potential damage to the 
program in terms of relative performance times. 
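
The comparison described above can be sketched for a small case. The four-state flow chart below is hypothetical, as are the function names and branch probabilities. The matrix follows the column convention, in which entry T[i][j] is the probability of passing to state i from state j.

```python
def step(T, p):
    """One clock step, column convention: p_new[i] = sum_j T[i][j] * p[j]."""
    n = len(p)
    return [sum(T[i][j] * p[j] for j in range(n)) for i in range(n)]


def end_state_probability(T, start, end, steps):
    """Probability of occupying state `end` after `steps` steps from `start`."""
    p = [0.0] * len(T)
    p[start] = 1.0
    for _ in range(steps):
        p = step(T, p)
    return p[end]


def remove_node(T, k):
    """Delete row k and column k, corresponding to avoidance of node k."""
    n = len(T)
    return [[T[i][j] for j in range(n) if j != k] for i in range(n) if i != k]


# Hypothetical flow chart: 0 = start, 1 = suspect node, 2 = alternative node,
# 3 = final instruction (absorbing).
T = [
    [0.0, 0.0, 0.0, 0.0],
    [0.3, 0.0, 0.0, 0.0],
    [0.7, 0.0, 0.0, 0.0],
    [0.0, 1.0, 1.0, 1.0],
]
suspect, start, end, steps = 1, 0, 3, 10

with_node = end_state_probability(T, start, end, steps)
T_reduced = remove_node(T, suspect)
# Indices above the deleted node shift down by one; start = 0 is unaffected.
end_reduced = end - 1 if suspect < end else end
without_node = end_state_probability(T_reduced, start, end_reduced, steps)
ratio = without_node / with_node
print(with_node, without_node, ratio)
```

In this assumed example the reduced matrix is no longer normalized in its first column, since the probability formerly routed through the suspect node is lost. The ratio of the two end-state probabilities reflects that loss.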
NOTES 


1. The Riordan convention of calling a particular instant in time an epoch to 
distinguish it from a time interval will be followed here. See W. Feller: 
An Introduction to Probability Theory and its Applications, Vol. I, Wiley, NY, 
which also discusses discrete Markov chains. 

2. The vertical line in the subscript emphasizes the separation of two 
distinct variables in a subscript. 

3. A. Kyrala: Selection Rules, Causality and Unitarity in Statistical and 
Quantum Physics: Foundations of Physics, Vol. 4, No. 1, March 1974, p. 31-51. 

5. The large arrow means "implies". 

6. See Appendix A. 

7. A. L. White: Upper and Lower Bounds for Semi-Markov Reliability Models of 
Reconfigurable Systems: NASA Contractor Report 172340, April 1984. 
A. L. White: Synthetic Bounds for Semi-Markov Reliability Models: NASA 
Contractor Report 178008. 

8. See Appendix B. 

9. Considering only YES agreements. There is a similar expression for NO 
agreements. 
APPENDIX A 

ILLUSTRATION OF TRANSITION MATRIX FOR SURE STATES 


Corresponding to the SURE State Diagram shown below 



Relabeling the states from pairs of digits (the first being the number of voter 
elements corresponding to YES, the second being the number of voter elements 
corresponding to NO) to the single digits indicated on the diagram, one may 
construct the transition matrix as follows 

                           initial states
              1     2     3     4     5     6     7     8

         1    0     0     0     0     0     0     0     0
         2   a_21   0     0     0     0     0     0     0
         3    0    a_32   1     0     0     0     0     0
 final   4    0    a_42   0     0     0     0     0     0
 states  5    0     0     0    a_54   0     0     0     0
         6    0     0     0     0    a_65   1     0     0
         7    0     0     0     0    a_75   0     0     0
         8    0     0     0     0     0     0    a_87   1



from which it can be readily discerned that the matrix is too sparse to fulfill 
the normalizations on rows and columns of the Semi-Markov chain, although the 
columns can sum to unity satisfying the causality condition of the Markov chain 
in the case where the transition times become constant. 
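
The normalization remark can be illustrated numerically. The sketch below assumes that the nonzero entries sit at (final, initial) = (2,1), (3,2), (4,2), (3,3), (5,4), (6,5), (7,5), (6,6), (8,7) and (8,8), which is one reading of the matrix above. The branch values 0.9 and 0.8 and the function name are hypothetical. Entries that stand alone in their column are set to 1, as column normalization requires.

```python
def sure_transition_matrix(a32, a65):
    """8x8 matrix A[final-1][initial-1] with the assumed sparsity pattern."""
    A = [[0.0] * 8 for _ in range(8)]
    entries = {
        (2, 1): 1.0,          # a_21, alone in column 1
        (3, 2): a32,
        (4, 2): 1.0 - a32,    # a_42, complement of a_32 in column 2
        (3, 3): 1.0,          # state 3 retains its probability
        (5, 4): 1.0,          # a_54, alone in column 4
        (6, 5): a65,
        (7, 5): 1.0 - a65,    # a_75, complement of a_65 in column 5
        (6, 6): 1.0,          # state 6 retains its probability
        (8, 7): 1.0,          # a_87, alone in column 7
        (8, 8): 1.0,          # state 8 retains its probability
    }
    for (final, initial), value in entries.items():
        A[final - 1][initial - 1] = value
    return A


A = sure_transition_matrix(a32=0.9, a65=0.8)   # hypothetical branch values
column_sums = [sum(A[i][j] for i in range(8)) for j in range(8)]
row_sums = [sum(row) for row in A]
print("column sums:", column_sums)
print("row sums:   ", row_sums)
```

Under these assumptions every column sums to unity while the row sums range from 0 to 2, so the row normalization cannot be met with this sparsity pattern.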




APPENDIX B 


NOTE ON SEQUENTIAL PATH FAILURE PROBABILITIES 
BY LAPLACE-STIELTJES TRANSFORM 
by A. Kyrala 

In considering the transmission of signals or faults through a path 
consisting of bipolar subsections, it is well known that the output of any 
section is the convolution of the input to that section and the system function 
for the section. For a linear array of such sections the overall output will 
be given by a repeated convolution. For four filters in series one has 



Fig. 1 


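so that the successive convolutions are 

$$ y_3(t) = \int_0^t s_4(\tau_4)\, x(t - \tau_4)\, d\tau_4 \qquad (1) $$

$$ y_2(t) = \int_0^t s_3(\tau_3)\, y_3(t - \tau_3)\, d\tau_3 \qquad (2) $$

$$ y_1(t) = \int_0^t s_2(\tau_2)\, y_2(t - \tau_2)\, d\tau_2 \qquad (3) $$

$$ y(t) = \int_0^t s_1(\tau_1)\, y_1(t - \tau_1)\, d\tau_1 \qquad (4) $$

which upon successive substitutions yields a four-fold multiple integral 

$$ y(t) = \int_0^t \int_0^{t-\tau_1} \int_0^{t-\tau_1-\tau_2} \int_0^{t-\tau_1-\tau_2-\tau_3} s_1(\tau_1)\, s_2(\tau_2)\, s_3(\tau_3)\, s_4(\tau_4)\, x(t - \tau_1 - \tau_2 - \tau_3 - \tau_4)\, d\tau_4\, d\tau_3\, d\tau_2\, d\tau_1 \qquad (6) $$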


Instead of dealing with (6) as an expression from which output can be 
calculated, one can use the Laplace-Stieltjes transform defined by 

$$ Y(s) = \int_0^\infty e^{-st}\, dF_y(t) \qquad (7) $$

where $F_y(t)$ is the distribution function for y(t). Using a similar notation 
for the other elements in Fig. 1, the transformed versions of (1), (2), (3), and 
(4) become 


$$ Y(s) = S_1(s)\, Y_1(s) \qquad (8) $$

$$ Y_1(s) = S_2(s)\, Y_2(s) \qquad (9) $$

$$ Y_2(s) = S_3(s)\, Y_3(s) \qquad (10) $$

$$ Y_3(s) = S_4(s)\, X(s) \qquad (11) $$


Thus instead of (6) one arrives at the transform of the output simply by 
multiplication 

$$ Y(s) = \prod_{k=1}^{4} S_k(s)\, X(s) \qquad (12) $$

In a similar way any number of elements in series can be treated. 
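
The equivalence of repeated convolution and multiplication of transforms can be checked on a discrete analogue. The sketch below assumes that all delays are integer multiples of the clock interval, so that each section is described by probability masses at t = 0, 1, 2, ... and the Laplace-Stieltjes integral reduces to a finite sum. The masses, the input and the value of s are illustrative assumptions.

```python
import math


def convolve(a, b):
    """Discrete convolution of two causal sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, a_i in enumerate(a):
        for j, b_j in enumerate(b):
            out[i + j] += a_i * b_j
    return out


def lst(masses, s):
    """Laplace-Stieltjes transform of a distribution with mass masses[k] at t = k."""
    return sum(m * math.exp(-s * k) for k, m in enumerate(masses))


# Hypothetical input and section distributions (each sums to one).
x = [0.5, 0.5]
s4 = [0.1, 0.3, 0.6]
s3 = [0.0, 0.0, 1.0]
s2 = [0.2, 0.8]
s1 = [0.0, 0.6, 0.4]

# Time domain: apply the sections in the order of (1) to (4).
y = x
for section in (s4, s3, s2, s1):
    y = convolve(section, y)

# Transform domain: the product of (12).
s = 0.37
product = lst(x, s)
for section in (s1, s2, s3, s4):
    product *= lst(section, s)

difference = abs(lst(y, s) - product)
print(lst(y, s), product, difference)
```

The transform of the convolved output agrees with the product of the individual transforms to rounding error. Adding further sections only lengthens the two loops.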

To determine the moments of output (or any intermediate stage), one simply 
differentiates (7) with respect to s and then lets s approach zero. Thus 

$$ Y^{(n)}(0) = (-1)^n \int_0^\infty t^n\, dF_y(t) \qquad (13) $$

so that the Maclaurin series for Y(s) is then 

$$ Y(s) = \sum_{n=0}^{\infty} Y^{(n)}(0)\, \frac{s^n}{n!} = \sum_{n=0}^{\infty} (-1)^n \langle t^n \rangle\, \frac{s^n}{n!} \qquad (14) $$

In particular 

$$ \langle t \rangle = \int_0^\infty t\, dF_y(t) = -Y'(0) \qquad (15) $$

is the mean for the output and the standard deviation $\sigma_t$ is given by 

$$ \sigma_t^2 = Y''(0) - [Y'(0)]^2 \qquad (16) $$

It should be clearly understood that the elements $s_k(t)$, which are taken 
to be system functions in filter theory, can in the present stochastic context 
be regarded as failure probability densities associated with subsections of the 
path. 
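
As an illustration of (15) and (16), the sketch below assumes that each subsection has an exponential failure density with rate r_k, whose transform is r_k/(r_k + s), and that the input is a unit impulse at t = 0 so that X(s) = 1. The rates, the step size h and the function names are assumptions. The derivatives at s = 0 are approximated by central differences.

```python
def section_transform(rate, s):
    """Transform of the exponential density rate*exp(-rate*t)."""
    return rate / (rate + s)


def path_transform(rates, s):
    """Y(s) as the product of section transforms, taking X(s) = 1."""
    y = 1.0
    for rate in rates:
        y *= section_transform(rate, s)
    return y


rates = [1.0, 2.0, 4.0, 5.0]   # hypothetical failure rates of four subsections
h = 1e-3                       # step for the central differences

y_plus = path_transform(rates, h)
y_zero = path_transform(rates, 0.0)
y_minus = path_transform(rates, -h)

d1 = (y_plus - y_minus) / (2.0 * h)               # approximates Y'(0)
d2 = (y_plus - 2.0 * y_zero + y_minus) / h ** 2   # approximates Y''(0)

mean = -d1                 # (15)
variance = d2 - d1 ** 2    # (16)

# For independent exponential stages the exact values are known.
exact_mean = sum(1.0 / r for r in rates)
exact_variance = sum(1.0 / r ** 2 for r in rates)
print(mean, exact_mean, variance, exact_variance)
```

The numerical mean and variance agree with the sums of the individual stage means and variances to within the accuracy of the finite differences.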